METHOD AND APPARATUS FOR MULTIMEDIA EDITING
Field of the Invention 

The present invention relates to multimedia and video and audio editing and, in 
particular: to the method of automated or semi-automated production of multimedia, 
video or audio from input content through the application of templates; and also to the
method of directing, controlling or otherwise affecting the application of templates and 
production of multimedia, video or audio through use of information about the input 
content. 

Background to the Invention 

Techniques and tools exist for the editing, post-production and also creation of
multimedia and video and audio productions or presentations. These techniques and tools 
have traditionally developed in the movie and video industries where sufficient finances 
and expertise have allowed and directed development of highly flexible tools but which 
require considerable planning and expertise and often multi-disciplinary expertise in order 
to complete a production at all, let alone to a standard level of quality.

Over time these tools have been simplified and reduced in capability and cost 
and several examples are now available in the consumer and hobbyist marketplace, 
typically for use on home computers and often requiring significant investment in
computer storage, system performance, accelerator or rendering hardware and the like. 
Typically, any one tool is insufficient to complete a product or to complete a production
to the required standard, therefore requiring investment in several tools. Furthermore, 
these tools are configured to require sufficient expertise to understand them and there is 
also a requirement to learn how to use the techniques. That is, the user must have or gain 
some expertise in the various disciplines within multimedia, video and audio
post-production. The state-of-the-art tools do not typically provide such expertise.
Furthermore, there is a known requirement for collaboration of the multi-disciplined team 
tasked with creating a multimedia, video or audio production. Said collaboration is 
typically a complex process and those unskilled in the art but wishing to create such a 
production find it difficult or impossible to achieve.

It is an object of the present invention to ameliorate one or more disadvantages
of the prior art. 

Summary of the Invention 

According to a first aspect of the invention, there is provided a method of 
processing at least one data set of multi-media input information, said data set comprising
at least one of video data, still-image data, and audio data, the method comprising the 
steps of: 

determining first meta-data from at least one of said data set, and second
meta-data associated with said at least one data set;

determining, depending upon the first meta-data, a set of instructions from a
template; and

applying the instructions to the input data set to produce processed output data. 

According to a second aspect of the invention, there is provided a method of 
processing at least one data set of multi-media input information, said data set comprising 
at least one of video data, still-image data, and audio data, the method comprising the
steps of: 

determining first meta-data from at least one of said data set, and second meta- 
data associated with said at least one data set; and 

determining, depending upon the first meta-data, a set of instructions from a 
template.

According to a third aspect of the invention, there is provided a method of 
processing at least one data set of multi-media input information, said data set comprising 
at least one of video data, still-image data, and audio data, the method comprising the 
steps of: 

applying a template to the input data set, whereby the template comprises a
temporal mapping process, and whereby the template is constructed using heuristic 
incorporation of experiential information of an expert, and whereby the applying step 
comprises the sub-step of:

applying the temporal mapping process to the input data set to produce modified
temporally structured processed output data. 

According to a fourth aspect of the invention, there is provided a method of 
processing at least one data set of multi-media input information, said data set comprising 
at least one of video data, still-image data, and audio data, the method comprising the
steps of: 

applying a template to the input data set, whereby the template comprises at least 
each of a temporal mapping process and an effects mapping process, and whereby the 
template is constructed using heuristic incorporation of experiential information of an 
expert, and whereby the applying step comprises the sub-steps of:

applying the temporal mapping process to the input data set to produce modified 
temporally structured data; and 

applying the effects mapping process to the modified temporally structured data 
to produce the processed output data.

According to a fifth aspect of the invention, there is provided an apparatus for
processing at least one data set of multi-media input information, said data set comprising 
at least one of video data, still-image data, and audio data, the apparatus comprising; 

capture means adapted to capture the input data set; 

first determining means for determining first meta-data from at least one of said 
data set, and second meta-data associated with said at least one data set;

second determining means for determining, depending upon the first meta-data, a 
set of instructions from a template; and 

application means for applying the instructions to the input data set to produce 
processed output data, wherein said first and second determining means, and said
application means are housed on board the capture means.

According to a sixth aspect of the invention, there is provided an apparatus for 
processing at least one data set of multi-media input information, said data set comprising 
at least one of video data, still-image data, and audio data, the apparatus comprising; 

capture means adapted to capture the input data set;

first determining means for determining first meta-data from at least one of said
data set, and second meta-data associated with said at least one data set; 

second determining means for determining, depending upon the first meta-data, a 
set of instructions from a template; and

application means for applying the instructions to the input data set to produce
processed output data, wherein said first and second determining means, and said
application means are distributed between the capture means and an off-board processor.

According to a seventh aspect of the invention, there is provided a computer 
readable memory medium for storing a program for apparatus which processes at least 
one data set of multi-media input information, said data set comprising at least one of
video data, still-image data, and audio data, the program comprising; 

code for a first determining step for determining first meta-data from at least one 
of said data set, and second meta-data associated with said at least one data set; 

code for a second determining step for determining, depending upon the first 
meta-data, a set of instructions from a template; and

code for an applying step for applying the instructions to the input data set to 
produce processed output data. 

According to an eighth aspect of the invention, there is provided a computer
readable memory medium for storing a program for apparatus which processes at least 
one data set of multi-media input information, said data set comprising at least one of
video data, still-image data, and audio data, the program comprising; 

code for a first determining step for determining first meta-data from at least one 
of said data set, and second meta-data associated with said at least one data set; and 

code for a second determining step for determining, depending upon the first 
meta-data, a set of instructions from a template.



Brief Description of the Drawings 

A number of preferred embodiments of the present invention will now be 
described with reference to the drawings, in which:
Fig. 1 depicts a typical application of derived movie-making techniques;
Fig. 2 shows a first example of a temporal structure mapping process; 
Fig. 3 shows a second example of a temporal structure mapping process;
Fig. 4 depicts mapping process steps in more detail; 
Fig. 5 illustrates an application relating to post production processing;
Figs. 6A and 6B illustrate incorporation of user-interaction; and 
Fig. 7 depicts a preferred embodiment of apparatus upon which the multi-media 
editing processes may be practiced; 

Table 1 presents preferred examples of the selection and extraction process; 
Table 2 illustrates preferred examples for the ordering process;

Table 3 presents preferred examples for the assembly process; 
Table 4 illustrates examples of effects mapping; 
Table 5 depicts a template for a silent movie; and 

Table 6 illustrates associations between editing and effect techniques and 
template type.

Appendix 1 presents a pseudo-code representation of a movie director module; 
Appendix 2 presents a pseudo-code representation of a movie builder example; and
Appendix 3 illustrates a typical template in pseudo-code for an action movie.

Detailed Description

First Preferred Embodiment of the Method 

Some of the typically poor features of consumer video, that are typically visible 
or obvious or encountered during presentation of said consumer video, may be reduced in 
effect or partially counteracted by automatic application of techniques derived from, or
substituting for, typical movie-making or video-making techniques. These derived, or
substitute, techniques can include techniques that are relatively unsophisticated compared 
to those typically applied in the movie-making industry. Furthermore, these relatively 
unsophisticated techniques, upon application to a consumer video or multimedia
recording or presentation can provide a positive benefit to the said video recording or
multimedia presentation, or parts thereof. 

Fig. 1 indicates an example system for automatic application of derived or 
substitute movie-making techniques to an input source content, typically a consumer 
audio/video recording or multimedia recording or presentation. Said derived or substitute
movie-making techniques are, or may be typically applied in two sequential steps to the 
said input source content, as shown in Fig. 1, resulting in the output of processed content 
that is presented or played or stored in preference to or in replacement of the said input 
source content. The intention of the embodiment is to provide a user with an improved 
presentation or recording by offering the user the possibility of using the output
(processed) content in substitution for the input source content. Said improvements may 
include or may only include reductions in poor quality features of the said input source 
content. 

It is the intention of the embodiment to operate on or process the provided input 
source content and not to require or force the user to provide or source or record
additional or replacement or improved input source content in order to effect the quality 
improvement or reduction in poor features claimed for the invention. The embodiment is, 
however, not restricted from utilising other, or additional, or replacement input source 
content or other content sources in order to achieve the stated goal of improvement of 
quality or reduction of poor features in the output content, in comparison with the input
source content. 

The process indicated in Fig. 1 includes steps: 101, input of source content; 102, 
automatic application of temporal structure mapping; 103, automatic application of 
effects mapping; and 104, output of processed content. 
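
The four steps above can be sketched as a simple pipeline. This is a minimal illustration only, not the implementation of the embodiment: the representation of content as a list of (start, end) segments, the function names, and the choice of a pass-through default for each mapping are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Assumed, illustrative content model: a list of (start_s, end_s) segments.
Segment = Tuple[float, float]
Content = List[Segment]
Mapping = Callable[[Content], Content]


def identity(content: Content) -> Content:
    """Pass-through mapping that introduces no changes (hypothetical default)."""
    return list(content)


def process(source: Content,
            structure_mapping: Mapping = identity,
            effects_mapping: Mapping = identity) -> Content:
    """Run steps 101 to 104 of Fig. 1 in sequence."""
    content_111 = list(source)                    # step 101: input of source content
    content_112 = structure_mapping(content_111)  # step 102: temporal structure mapping
    content_113 = effects_mapping(content_112)    # step 103: effects mapping
    return content_113                            # step 104: output of processed content
```

For instance, passing a structure mapping that truncates every segment to its first ten seconds yields a shorter output, while the defaults return the input unchanged.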

Fig. 1 step 101 may involve any reasonable method known in the art for input or
capture of source content into a form suitable for subsequent processing by an automatic 
hardware or software or combined system, typically a general purpose computer system as 
described below with reference to Fig. 7. Such methods may include digitisation of 
analogue data, or may include reading a digitised serial data stream, for instance, sourced
by a camcorder or DVCamcorder device, and may include format conversion or
compression or decompression, or combinations thereof as well as, typically, writing of
the digitised, converted or reformatted data, 111, to a suitable storage medium within the
aforementioned automatic system ready for subsequent processing at 102.

Fig. 1 step 102 involves the optional mapping of the temporal structure, or part
thereof, of the stored input source content to a new temporal structure, or part thereof, 
output at 112. Step 102 is a processing step involving modification of the temporal 
structure of said input content, 111, to obtain said output content, 112, where said 
mapping process may include the reduced case of identity mapping (that is, no changes
are introduced). Typically, more useful mapping processes, singular or plural, may be
involved in Fig. 1 step 102 and these are described herein as both preferred embodiments 
or parts thereof of the invention as well as being examples, without restriction, of 
embodiment of the invention. 

A first example of a useful temporal structure mapping process that may be
implemented in Fig. 1 step 102, is shown diagrammatically in Fig. 2. This first example
mapping process involves the reduction of the duration of the content 111, from Fig. 1 
step 101 when mapped to the output content, 112, as well as a consequent overall 
reduction in the duration of the entire content presentation. The example in Fig. 2 
assumes a retention of chronological ordering of the input source content, 111, when
mapped to the output content, 112. The input content comprises one whole temporal
element, 201, about which little or nothing may be known by the automatic system 
regarding the temporal structure other than, typically, the total duration of the content, 
111. This content may typically comprise video and audio information previously 
recorded and now provided by the user, as well as possibly including still images and
other multimedia elements. This content may even have been previously processed or
even artificially generated, in which case a variety of content types may be included. In 
this first example of the temporal structure mapping process, 250, the automatic system 
may select portions of the input content and reassemble these. The important feature of 
the mapping process in this example is the reduction of overall duration of the output
content, 112, in comparison with the duration of the input content, 111. This automatic
reduction of duration of source content can be one of several significant techniques for 
reducing the poor features of said source content or, conversely, for increasing the 
positive perceptual quality of said source content when it is presented to viewers. The 
mapping process, 250, in Fig. 2, in this first example, may typically comprise steps of:
selection and extraction (Fig. 4, 401) of a number of content portions, for instance, 261,
262, 263, from 201, which is a timeline representation of input content 111 in Fig. 1;
ordering (Fig. 4, 402) of content portions 261, 262, 263, which in this first example
involves retention of the same sequencing or chronology of the extracted portions as was
present in 201; and assembly (Fig. 4, 403) of extracted portions 261, 262, 263, to yield the
output content, 112, shown in Fig. 2 in a timeline representation, 290.
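
By way of illustration only, the three steps of this first example might be sketched as follows, for the case in which nothing but the total duration of the input is known. The rule used for selection (equal-length portions taken from the start of evenly spaced slots) and all names are assumptions introduced for the example; the specification does not prescribe them.

```python
from typing import List, Tuple

Portion = Tuple[float, float]  # (start_s, end_s) on the input timeline 201


def select_and_extract(total: float, count: int, length: float) -> List[Portion]:
    """Step 401 (assumed rule): take `count` portions of `length` seconds,
    each from the start of an equal-sized slot of the input timeline."""
    if count <= 0 or length <= 0 or count * length > total:
        raise ValueError("portions must fit within the input duration")
    slot = total / count
    return [(i * slot, i * slot + length) for i in range(count)]


def order(portions: List[Portion]) -> List[Portion]:
    """Step 402: retain the chronology of the input content."""
    return sorted(portions)


def assemble(portions: List[Portion]) -> List[Tuple[float, Portion]]:
    """Step 403: join portions end to end on the output timeline 290.
    Each entry is (output_start_s, source_portion)."""
    timeline, cursor = [], 0.0
    for start, end in portions:
        timeline.append((cursor, (start, end)))
        cursor += end - start
    return timeline


def output_duration(timeline: List[Tuple[float, Portion]]) -> float:
    """Total duration of the assembled output."""
    return sum(end - start for _, (start, end) in timeline)
```

With a 600-second input, three 20-second portions give a 60-second output in which source order is preserved, which is the reduction of duration that this example relies on.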

More complex mapping processes, 250, are possible, potentially yielding better 
results, or a greater probability of better results than the first example already described. 
For instance, a second example, shown in Fig. 3, may involve more knowledge of the
temporal structure of the input content, 111, in the mapping process, 250, to yield a better
result, 112, or an improved probability of a better result at 112. For instance, when the 
automatic system applies selection and extraction step 401 to the input content in Fig. 3, it 
may have the benefit of some information about the temporal structure of the input 
content. In Fig. 3 an example temporal structure is shown in which the input content
comprises five consecutive portions, 301, 302, 303, 304, 305, labelled Clip 1, Clip 2, Clip
3, Clip 4, and Clip 5, respectively. Information concerning the duration of these clips may 
be available with the input content or may be measured in standard ways by the automatic 
system. The selection and extraction step, 401, now has the opportunity to perform one or 
more of a variety of functions or algorithms utilising this available or measured temporal
structure information to select and extract a portion or portions from the input content. A
list of preferred examples for selection and extraction step 401 is given in Table 1 and
these are provided without restriction on the possible methods of performing step 401. A
selection and extraction step may be obtained from Table 1 by combining any example
from each column, although not all combinations need be useful. Step 402 of the
mapping process, 250, may provide a greater variety of ordering methods and/or greater
predictability or control of ordering methods if access to information about the temporal 
structure of the input content, 111, is available and/or if information about the temporal 
attributes of the selection and extraction process 401 relative to the temporal structure of 
the input content is available. The ordering step, 402, now has the opportunity to perform
one or more of a variety of functions or algorithms utilising this available temporal
structure information to order portions previously selected and extracted from the input 
content. A selection of preferred examples for ordering step 402 are listed in Table 2 and 
these are provided without restriction on the possible methods of performing step 402.
Step 403 of the mapping process, 250, may provide a greater variety of assembly methods
and/or greater predictability or control of assembly methods if access to information about 
the temporal structure of the input content, 111, is available and/or if information about 
the temporal attributes of the selection and extraction process 401 relative to the temporal 
structure of the input content is available and/or if information about the ordering process
402 relative to the temporal structure of the input content is available. The assembly step,
403, now has the opportunity to perform one or more of a variety of functions or
algorithms or assembly methods utilising this available temporal structure information
to assemble portions previously selected and extracted from the input content and 
consequently ordered. A selection of preferred examples for assembly step 403 are listed
in Table 3 and these are provided without restriction on the possible methods of
performing step 403. 
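
A sketch of how clip-duration information might inform selection and extraction step 401 is given below. Table 1 is not reproduced here, so the particular rule shown (skip clips shorter than a threshold and take the opening seconds of each remaining clip) is a hypothetical example rather than one of the listed preferred examples; the class and function names are likewise assumptions.

```python
from typing import List, NamedTuple


class Clip(NamedTuple):
    label: str
    start: float      # seconds from the start of the input content
    duration: float   # seconds


class Portion(NamedTuple):
    clip: str
    start: float
    end: float


def clips_from_durations(durations: List[float]) -> List[Clip]:
    """Build consecutive clips (Clip 1, Clip 2, ...) from known clip durations."""
    clips, cursor = [], 0.0
    for index, duration in enumerate(durations, start=1):
        clips.append(Clip(f"Clip {index}", cursor, duration))
        cursor += duration
    return clips


def select_per_clip(clips: List[Clip], take: float,
                    min_duration: float) -> List[Portion]:
    """Assumed rule for step 401: ignore clips shorter than `min_duration`
    and extract up to the first `take` seconds of every other clip."""
    return [Portion(c.label, c.start, c.start + min(take, c.duration))
            for c in clips if c.duration >= min_duration]
```

Because each extracted portion records the clip it came from, later ordering and assembly steps could use that information, for example to place a transition only where two portions come from different clips.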

In the simplest of mapping process methods, 250, related or synchronised or 
coincident audio and video data, for example, may be treated similarly. However, there 
are known techniques, some of which may be automated, in the movie-making and
video-making industries for treating audio near video transitions or vice versa to retain or obtain
best quality results and these techniques may be employed in mapping process 250. 

Following structure mapping, 102, is effects mapping, 103, in Fig. 1. The output 
content, 112, from the structure mapping process, 102, has effect mapping performed 
automatically on it, resulting in output content, 113. In the simplest case, effects
mapping, 103, may be the identity case, in which the input content, 112, is unchanged and
output at 113. Typically, however, one or more of a variety of effects may be
automatically applied at 103, to either or both the audio and video content, for example,
within content 112. These effects may include processes or functions or algorithms
well-known in the art and Table 4 provides an example list of effects. A variance in the order in
which effects are applied to the same content typically results in different output content 
and therefore, the particular ordering of effects applied to content 112, may also be 
considered an effect. Effects may be applied without knowledge of the temporal structure 
mapping process nor of the input content's temporal structure at 111, in which case it may
be typical to apply an effect uniformly to the whole content at 112. Alternatively, some
effects may be applied with knowledge of the input content's temporal structure, or with 
knowledge of the temporal mapping process at 102, and typically, such effects may be 
applied to a portion or portions of the content, 112. 
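
The observation that the ordering of effects is itself an effect can be shown with a small sketch. The two effects used here (a brightness offset and a gain applied to 8-bit sample values) are chosen purely for illustration and are not taken from Table 4; all names are assumptions.

```python
from functools import reduce
from typing import Callable, List, Sequence

Frame = List[int]                 # assumed model: 8-bit luminance samples
Effect = Callable[[Frame], Frame]


def clamp(value: int) -> int:
    """Keep a sample within the 8-bit range."""
    return max(0, min(255, value))


def brighten(amount: int) -> Effect:
    """Return an effect that adds a fixed offset to every sample."""
    return lambda frame: [clamp(p + amount) for p in frame]


def gain(factor: float) -> Effect:
    """Return an effect that multiplies every sample by a factor."""
    return lambda frame: [clamp(round(p * factor)) for p in frame]


def apply_effects(frame: Frame, effects: Sequence[Effect]) -> Frame:
    """Apply effects left to right; an empty sequence is the identity case."""
    return reduce(lambda current, effect: effect(current), effects, frame)
```

Applying `brighten(20)` and then `gain(2)` to the samples `[10, 100]` gives `[60, 240]`, whereas the reverse order gives `[40, 220]`, so the same two effects produce different output content depending on their sequence.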

In the first embodiment, temporal mapping and effects mapping are, or may be,
applied automatically to input content to produce output content that may have poor
features reduced or improvement of quality or both for the purpose of improving the 
perceptual experience of a viewer or viewers. The first embodiment describes said 
example or examples in which minimal information is available to the embodiment about 
the input content, amounting to information about the content's total duration or perhaps
information about the content's segmentation and clip duration and sequence (or
chronology) and without direction or input or control by the user other than to select the 
entirety of the input content for application to the embodiment. Furthermore, the first 
embodiment of the invention may not include user control of, or selection of, temporal 
structure mapping functions or parameters nor of effects mapping functions or parameters.
Further, the specific temporal mapping function or functions and effects mapping
functions exercised by the embodiment may be automatically selected without user 
control and without the benefit of additional, extra or external information or analysis 
concerning the content and yet the embodiment is capable of producing successful results 
as may be perceived by a viewer of the output content, 113. This fact is a likely result of
the expectations and critical faculties and the like of the viewer as applied to the output
content. Thus, it may be said that the first embodiment of the invention effectively 
provides a random and largely unrelated set of temporal structure mapping and effects 
mapping processes for application to input content with some probability of the output 
content being perceived as improved or reduced of poor features by a viewer.

The temporal mapping process and the effects mapping process may be described 
as being, or as being part of, or as obeying rules or rule-sets where the rules or rule-sets
may include these properties or relations or information or entities: explicit declaration or
implementation or execution of functions or algorithms or methods for performing the
structure mapping and/or effects mapping processes and potentially other processes;
references to said functions, algorithms or methods, where the actual functions or 
algorithms or methods may be stored elsewhere, such as in a computer system memory or 
on a medium such as a hard disk or removable medium or even in another rule-set; 
possible relations or associations or attribute or parameter passing methods for controlling
or specifying information-passing or dependencies between functions or algorithms or
methods or even between rules or rule-sets; rules or rule-sets specifying methods of 
selecting which temporal structure mappings and effects mappings will be executed or 
implemented in any particular application of the embodiment; ordering and/or repetition 
information for the application of mappings or of rules or rule-sets; heuristics information
or information derived heuristically for direction or control of any portion of said rule-set
and related information. 

A collection of rules or rule-sets and any of the aforementioned properties or 
relations that may be included in or with a rule-set may be collectively described as a 
template. The act of application of the embodiment to input content, as previously
described by means of application of temporal mapping or mappings and/or effects
mapping or mappings to said input content and the associated relations or dependencies 
between these mappings, may be described as application of a template identically 
describing said application of said mappings and rules or rule-sets to input content to 
derive said output content. 
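
One possible way to picture a template as a collection of rules is sketched below. It is not the template format of the embodiment (the appendices give the authors' own pseudo-code); the `Rule` and `Template` classes, the `REGISTRY` of named functions and the three sample functions are all hypothetical. The sketch shows only three of the properties listed above: references to functions stored elsewhere, parameter passing, and ordering and repetition information.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical registry standing in for functions "stored elsewhere";
# rules refer to these mappings by name rather than containing them.
REGISTRY: Dict[str, Callable[..., List[Any]]] = {
    "take_first": lambda items, n: items[:n],               # structure mapping
    "reverse": lambda items: items[::-1],                   # structure mapping
    "tag": lambda items, label: [(label, x) for x in items],  # effects mapping
}


@dataclass
class Rule:
    function: str                                   # reference to a registry entry
    params: Dict[str, Any] = field(default_factory=dict)
    repeat: int = 1                                 # repetition information


@dataclass
class Template:
    name: str
    rules: List[Rule]                               # list order is application order

    def apply(self, content: List[Any]) -> List[Any]:
        """Apply every rule in order, resolving each function reference."""
        for rule in self.rules:
            function = REGISTRY[rule.function]
            for _ in range(rule.repeat):
                content = function(content, **rule.params)
        return content
```

Keeping the rules as data and the functions in a separate registry means that the same template can be stored, conveyed and applied to any input content without modification.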

A further example of the first embodiment involves application of the template
to multiple input content to achieve one output content. Various types of input content 
may be accepted including audio, video, graphics, still images, clipart, animations, video 
keyframes or sequences thereof, mattes, live video source, for instance from a camcorder 
or DVCamcorder, or multiple sources of each of these, or combinations of these per input
content source. In this example, the embodiment, for the purposes of the mapping 
processes, may treat each input content as a portion of the sum of the entire input content 
applied. Further, this embodiment of the invention may or may not have information 
about the relative sequence or chronology of some or all of the multiple input content
sources, if this is relevant or practical. Several practical applications of this example
exist, including a personal content display device, perhaps mounted as an electronic 
picture frame in which images or video and/or audio, etc may be displayed automatically 
by the embodiment. In this application, the input content may have been previously stored 
in a memory or on media such as a hard disk drive. Another practical application for this
embodiment of the invention may be as an entertainment device for social occasions such
as for parties, in which the embodiment may display output processed from multiple input 
content sources, possibly including live audio and video sources for public or social 
enjoyment. The input content may have been previously selected by the user to be 
suitable for the social occasion and the template or templates executed by the
embodiment, including any sequencing or execution instructions pertaining to the
templates themselves, may also have been preselected by the user.

Second Preferred Embodiment of the Method

The template definition made in the first embodiment may be extended to 
include the capability to store and convey and execute or direct the execution of a set of
rules and associated information and content and functions, mappings and algorithms, or
any combination or repetition or sequence thereof, where said rules and associated 
information, etc may have been created or defined or arranged by, or authored
under the direct or indirect control of, an expert or experts or any person or persons skilled
or experienced in the art or arts of multimedia presentation, movie-making, video
production, audio production, or similar. The purpose of such a template or templates is
to convey and/or control the production or presentation or post-production of input 
content (single or multiple) provided by a user in order to deliver output content which 
may be perceived by a viewer as comparably positively improved or reduced of some 
negative aspects with respect to the unmodified input content (single or multiple). Such a
template or templates may contain heuristics, expert systems, procedural rules, script or 
scripts, parameters, algorithms, functions, provided content including at least all types of 
input content previously described, or references to any of these, or even merely data 
parameters used to set up a machine or system equivalent to the embodiment. Said
template or templates may also include information in the form of text, graphics, video,
audio, etc capable of describing or approximately describing the action or intent of the 
template or of the template author to a user in order to allow said user the opportunity to 
make a selection of, or between, one or more templates. 

A practical purpose for said template or templates may include, for example, the
processing of input content to create the appearance of a typical professionally-made
video or movie from the content output by the embodiment. Similarly, it may be 
desirable to create a mood, or genre, or look, or other emotional or perceptible effect or 
feeling in content output by the embodiment from input content which does not include 
said mood, or genre or look, or other emotional or perceptible effect or feeling or not to
the same degree, in the opinion of the user or the viewer, or both. Typically,
post-production, or making of a video or movie, requires team-work and a variety of skills for the
capture process (acting, directing, script-writing, camera direction, etc) and for the
post-production process (editing, directing, graphic-arts, music composition, etc). Typically
this skill-set is unavailable to consumers and business people who may commonly use a
camcorder, DVCamcorder, still-image camera, audio recorder, or the like. Portions of
this team-work and skill-set may be compiled into a template form and made available to 
users of the embodiment in said template form. Since the input content has often already 
been captured, therefore reducing or limiting the ability of the embodiment, under control 
of the template, to affect the capture process, the skill-set contained or described or
compiled within the template is, in that instance, typically limited to controlling, directing
or executing application of the embodiment to the post-capture process, as indicated in 
Fig. 1, wherein the said template or templates may replace or control or direct or execute 
or do any combination of these for the mapping processes 102 and 103. In other cases, 
however, a portion of the input content may not already have been captured. In this
instance, that portion of the input content can be live, and the template can operate on, or 
with, or in conjunction with that live portion of input content. The extent of processing in 
relation to "online" content depends on the extent of processing power available in the 
embodiment.

Fig. 5 indicates the second preferred embodiment of the invention in which said
template or templates may be used to control or direct or execute processing of input
content in a manner equivalent to or derivative of typical movie or video or audio or
musical or multimedia post-production techniques in order to produce output content that
may be perceived by a viewer as positively improved or reduced in negative aspects.
Movie Director, 503, receives an input template or templates, 501, and input content
(singular or plural), 502. In this preferred embodiment, input content, 502, will typically
include synchronised, parallel or coincident video and audio content such as delivered by 
a camcorder or DVCamcorder device, or still images or graphical content, or music or 
other audio content. Input content, 502, will also typically include information about the 
20 input content, also known as metadata, that may specify some temporal structure 
information concerning the input content. In this second embodiment it is assumed that 
the said information about the temporal structure of the said input content is similar in 
type and quantity to that described in the first preferred embodiment. 

Movie Director, 503, analyses the rules and other elements contained within
template(s) 501 and constructs a series of instructions, 504, suitable for interpretation
and/or execution by movie builder 505. The series of instructions, 504, in the form of a
render script, typically also containing aspects of an edit decision list (EDL), is compiled
by the movie director, 503, to typically also include references to input content, 502, and
also possibly references to content provided by the template(s), 501, and possibly also
references to other elements or entities including functions, algorithms, and content
elements such as audio, music, images, etc, as directed by the template(s), 501. Typically,
template(s) 501, will direct the movie director, 503, to select and describe one or more
mappings for structure or effects to be applied to the input content, 502, or to said other
provided or referenced content. Typically the template(s), 501, may have insufficient
information for the movie director, 503, to resolve all references concerning the input
content, 502. Typically, these unresolved references may be due to insufficient
information to determine which of the input content is to be operated on by the
embodiment, or the location of the input content, or similar issues. Movie Director, 503,
may obtain sufficient information to resolve these issues by requesting or waiting for
input by a user via a user interface, 507. Typical input at 507 may include selection of
one or more input content items or selection of portions of content to be made available at
502 as input content, or selection of points of interest within content to be made available
as metadata information to movie director 503. Movie director 503, using information
sources, 501, 502, 507, outputs a render script, 504, with all references resolved within
the system so that Movie Builder, 505, may find or execute or input the entirety of the
referenced items without exception.
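
The reference-resolution step performed by the movie director may be illustrated by the following minimal sketch. It is not the pseudo-code of Appendix 1; the function name `compile_render_script`, the `ask_user` callback (standing in for user interface 507) and the dictionary layout of rules and instructions are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional


def compile_render_script(
    template_rules: List[dict],
    input_content: Dict[str, str],
    ask_user: Callable[[str], Optional[str]],
) -> List[dict]:
    """Build a render script in which every content reference is resolved.

    template_rules: hypothetical rule records, each naming an effect and
        the content reference ("ref") the effect applies to.
    input_content: maps a reference name to a content location.
    ask_user: stands in for user interface 507; returns a location or None.
    """
    script = []
    for rule in template_rules:
        ref = rule["ref"]
        location = input_content.get(ref)
        if location is None:
            # The template alone cannot resolve this reference, so the
            # missing information is requested from the user.
            location = ask_user(ref)
        if location is None:
            raise LookupError(f"unresolved reference: {ref}")
        script.append({"effect": rule["effect"], "source": location})
    return script


rules = [
    {"effect": "sepia", "ref": "clip_a"},
    {"effect": "title", "ref": "clip_b"},
]
content = {"clip_a": "tape1/scene3"}
answers = {"clip_b": "tape2/scene1"}  # simulated user selections
render_script = compile_render_script(rules, content, answers.get)
```

In this sketch the render script is only emitted once every reference has a location, so a downstream builder never encounters an unresolved item.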

Movie builder 505 may typically execute or obey render script 504 directly, as
typically, movie director, 503, has been designed to output the render script, 504, in a
format suitable for direct execution by movie builder 505. Movie builder 505 may read
and execute render script contents, 504, in a series-parallel method, as is typically
required for video and audio parallel rendering or post-production. Additionally, movie
builder 505 may execute or obey render script 504 by any reasonable method that will
yield the expected result, or the result intended by the author of template(s) 501. Movie
builder 505 may typically be or include a video and audio renderer such as Quicktime
3.0®, a product of the Apple® Corporation, or equivalent. Movie builder 505 may also
be or include a hardware renderer or a combined hardware and software renderer and may
be capable of realtime operation if a particular application of the embodiment so requires
it. It may be noted that a difference between the first embodiment and the second
embodiment, as is visible when comparing Fig. 1 and Fig. 5, is that in the first
embodiment in Fig. 1, the mapping processes may execute without first compilation and 
resolution of references, whereas in the second embodiment, the rendering processes, 
which include the mapping processes of the first embodiment, may be executed following 
compilation of a render script derived from a template and following resolution of
references. Movie builder 505 may typically include or provide any or all of the video or
movie or multimedia or audio editing and effects and related functions, algorithms, etc for
execution according to the method, order, sequence, etc instructed by the render script 504
and as intended or directed by the template 501. Movie builder 505 renders, edits, or
otherwise modifies the input content, 502, and provided content (portion of 501) or other
referenced content (possibly present in the system), according to the instructions,
sequence, and referenced functions, etc included in render script 504 and outputs the
completed production to, optionally, either or both of 506, a storage element, or 508, a
movie player. The storage system, 506, may be used to store the production indefinitely,
and may be a device including a camcorder, DVCamcorder, or hard disk drive, or
removable medium or remote storage accessible via the internet or similar or equivalent
or a combination of any of these wherein the output production may be partially stored on
any or all of these or may be duplicated across any or all of these. Store 506 may
optionally store the output production and does not restrict the possibility of the output
production being played and displayed immediately by movie player 508 and display 509,
nor does store 506 limit the possibility of movie builder 505 being capable of rendering in
realtime and also playing the output production, in which case movie builder 505 and
movie player 508 may be the identical component within the system of the embodiment.
User interface 507 may also provide user control of movie builder 505 or of movie player
508, if desired, to allow control of features or functions such as starting of execution of
movie builder 505 or starting of playing by movie player 508, or stopping of either 505 or
508 or both. User interface 507 may also permit a user to specify the location of store
506, if it should be used, or other related options. Movie player 508 may be or include
Apple® Quicktime® 3.0 renderer or any other equivalent or similar movie player.
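
The series-parallel execution of a render script mentioned above may be pictured with the following sketch. It is an illustration under stated assumptions rather than the pseudo-code of Appendix 2: the render script is assumed to be an ordered list of steps, each step holding one operation per track, and the effect names ("sepia", "piano", "scratches") are hypothetical placeholders.

```python
from typing import Callable, Dict, List

# One step of a hypothetical render script: track name -> operation.
Step = Dict[str, Callable[[str], str]]


def build_movie(script: List[Step], tracks: Dict[str, str]) -> Dict[str, str]:
    """Apply the steps in series; within a step, treat tracks as parallel."""
    state = dict(tracks)
    for step in script:
        # Every operation in a step reads the state from before the step,
        # so the operations of one step do not depend on each other.
        updated = {name: op(state[name]) for name, op in step.items()}
        state.update(updated)
    return state


script = [
    {"video": lambda v: v + "+sepia", "audio": lambda a: a + "+piano"},
    {"video": lambda v: v + "+scratches"},
]
output = build_movie(script, {"video": "clip", "audio": "sound"})
```

Here strings stand in for rendered media; a real builder would hand each operation to a renderer instead of concatenating text.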
A specific example of an application of the second embodiment described in Fig.
5 provides for the application of a Silent Movie template, via the system described in Fig.
5, to a user's input content to produce output content that may be perceived by a viewer to
be similar to or to evoke feelings of or impressions of the silent movie genre or style of
movie or film. Said Silent Movie template includes rules or relations between separate
mapping processes, said rules or relations being intended to group or direct the production
of a particular perception or feeling within the content output from the system in Fig. 5.
Said rules or relations may include passive relations or grouping of mapping processes by
the method of including said processes within a single template and excluding other
unwanted processes. Further, said relations between mapping processes, etc may be
active, being rules or similar and capable of being executed or operated or decided during
execution of the system in Fig. 5.

Said Silent Movie template may include a typical set of template components 
listed in Table 5. There may be many ways to construct a template or to apply or order its 
components to achieve an equivalent result to that of the Silent Movie production and the
example in Table 5 is not limiting on these many construction methods or options or 
orderings or applications. Said Silent Movie template example in Table 5 may be 
considered as an example of passive relationships between template components to 
achieve an overall production and consequent perception, as previously described. Many 
of the components listed in Table 5 may alone typically elicit some perception of the
Silent Movie genre, but the combination or sum of these elements being coincident in one 
template and their sum effect on the input content result in a consequently strong 
perceptual reference or allusion to the Silent Movie genre. 

Appendix 1 includes an example implementation of the Movie Director module,
in pseudo-code. Appendix 2 includes an example implementation of the Movie Builder,
also in pseudo-code. Appendix 3 includes an example template implementation, also in
pseudo-code. The template in Appendix 3 has been designed to create a fast-paced,
fast-cutting production with a fast-beat backing music track to give the impression of an action
movie. When the example in Appendix 3 is compared with the previous Silent Movie
genre template description the versatility of the invention may be recognised.

Table 6 provides example associations between editing & effect techniques and 
template type, where each template type is intended to induce or suggest one or more 
moods or is intended for application to input content of a particular kind or relating to a
particular event type. 

Templates need not be fixed, nor entirely previously authored. A template or
templates may be modified through various means as part of the method of the
embodiment, including: inclusion of user-preferences; direct or indirect
user-modification; inclusion of information or inferences derived from information about input
content; modification by or in conjunction with another template or templates;
modification or grouping or consolidation or composition by a meta-template; template
customisation. Modification of a first template in conjunction with a second template
can be facilitated by using standardised naming of template elements or parameters
and/or standardised structuring of template information.

Appendix 3 provides an example of standardised naming of template parameters
and elements, e.g. 'cut_order' and 'intraclip_spacing'. Incorporation of standard names of
this kind, or use of a format, structure or model inferring element and parameter names or
identities, facilitates template modification. For example, the template in Appendix 3
might be modified by direct means (manual or automatic) through searching the template
information for a name or inferred element identity, and then replacing the related value
text string, reference or other attribute associated with that name (if any).

Another example of template modification, again with reference to Appendix 3, 
involves replacement or swapping element values or attributes between like-elements in 
different templates. For example, if a user, through direct or indirect means, indicates a
preference for a 'Random' cut_order property from a differing template, but otherwise 
prefers all of the properties of a "Romantic" template, then the 'chronological' cut_order 
property in the Romantic template could be replaced by the 'Random' property from 
elsewhere. 
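
The swapping of like-named elements between templates may be sketched as follows, assuming for illustration only that a template is held as a flat dictionary keyed by the standardised names. The function `replace_element`, the dictionary form and the numeric `intraclip_spacing` values are hypothetical; only the names 'cut_order' and 'intraclip_spacing' and the values 'chronological' and 'Random' are taken from the description.

```python
import copy


def replace_element(template: dict, donor: dict, name: str) -> dict:
    """Return a copy of `template` whose element `name` is taken from `donor`."""
    if name not in donor:
        raise KeyError(f"donor template has no element {name!r}")
    modified = copy.deepcopy(template)
    modified[name] = copy.deepcopy(donor[name])
    return modified


# Hypothetical template contents keyed by standardised element names.
romantic = {"cut_order": "chronological", "intraclip_spacing": 4.0}
other = {"cut_order": "Random", "intraclip_spacing": 0.5}

custom = replace_element(romantic, other, "cut_order")
```

Because the names are standardised, the swap needs no knowledge of what either template does with the element; the original Romantic template is left unchanged.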
Yet a further example of template modification involves prioritisation or
weighting, of template elements to indicate their value to, or their influence on the overall
impression of the template. This information, which is effectively template metadata, can
be used to control, or coordinate, user or automatic template modification. The
information can thus enable judgements to be made as to the effect and potential
desirability of particular modifications to a template.
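
One way in which such weighting metadata might inform a judgement about a proposed modification is sketched below. The weights, the threshold value and the element names other than 'cut_order' are assumptions made purely for illustration; the description does not prescribe a particular scale or rule.

```python
def influential_changes(weights: dict, changes: dict, threshold: float = 0.7) -> list:
    """List the changed elements whose weight is at or above `threshold`.

    Elements without a recorded weight are treated as having no influence.
    """
    return sorted(name for name in changes if weights.get(name, 0.0) >= threshold)


# Hypothetical template metadata: influence of each element on the
# overall impression of the template, on a 0..1 scale.
weights = {"cut_order": 0.3, "backing_music": 0.9, "title_font": 0.5}

# A proposed set of modifications to the template.
changes = {"cut_order": "Random", "backing_music": "waltz"}

flagged = influential_changes(weights, changes)
```

A user interface or an automatic process could then ask for confirmation only for the flagged elements and apply the remaining changes directly.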

The method of the embodiment may support a meta-template which is capable of
acting on a collection of templates, including functions such as: selection of template(s)
based on criteria such as information about input content, user preferences, etc;
modification, grouping, consolidation or composition of a group of templates;
customisation of one or more templates, typically under specific user control.
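
The template-selection function of a meta-template may be sketched as a simple scoring over a collection of templates. The scoring scheme, the field names `event` and `mood`, and the sample catalogue are hypothetical and are offered only as one possible reading of the selection criteria named above.

```python
def select_template(templates: list, metadata: dict, preferences: dict) -> str:
    """Return the name of the template that best matches the criteria."""

    def score(template: dict) -> int:
        points = 0
        # Information about the input content counts more than preference.
        if template["event"] == metadata.get("event"):
            points += 2
        if template["mood"] == preferences.get("mood"):
            points += 1
        return points

    return max(templates, key=score)["name"]


# Hypothetical catalogue of templates known to the meta-template.
templates = [
    {"name": "Silent Movie", "event": "any", "mood": "nostalgic"},
    {"name": "Wedding", "event": "wedding", "mood": "romantic"},
    {"name": "Action", "event": "sport", "mood": "exciting"},
]

chosen = select_template(templates, {"event": "sport"}, {"mood": "romantic"})
```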
The method of the embodiment, through design and provision of suitable
templates, may be used to provide a presentation or album function when applied to, or
operating on input content and/or content provided with a template.

Third Preferred Embodiment of the Method

The previous embodiments may be further enhanced or extended by the inclusion
of user-interactivity or user input or user preferences or user setup or user history or any
of these. It is especially preferred that templates include the capability for requesting user
input or preferences, or of requesting such information or interaction directly or indirectly,
and be able to include such information in decision-making processes such as the
aforesaid rules, in order to determine, or conclude or infer and execute, apply or direct a
production including at least some of the preferences or desires of the user.

At its simplest, user interaction or control may include selection of a template or 
templates and selection of input content (single or multiple) for application, execution or 
direction of the former to the latter to output a production.

Of particular interest is the opportunity for the embodiment, and of the 
template(s) therein, to utilise or enquire of the user's potential knowledge of the input 
content to presume or infer the best application, execution or direction of the template to 
said input content. This user-interaction and presumption or inference may be 
implemented in a variety of methods, including the simultaneous implementation of
several alternative methods. The kinds of knowledge of the input content that may be
obtained include: user preference for, neutrality towards or dislike of one or more input
content segments; user preference for or dislike of point or points within input content
segments; user preference for, neutrality towards or dislike of similarly-labelled sections
of input content, for instance, where a database of labelled or partially labelled input
content may be accessible to the embodiment; user approval or disapproval of an output
production or portion or portions thereof.

Fig. 6A indicates, in 600, a possible method of obtaining knowledge of input
content from a user. The user may be asked or prompted to indicate highlights or
emotionally significant or otherwise important portion or portions of input content. One
method of implementing this interaction is to allow the user to indicate a point of
significance, 605, and the embodiment, typically through the application of rules within
the template(s), may infer zones, 610 and 611, before and after the point of interest that
may be selected and extracted for inclusion in the production and application by the
template. Typically, the durations, 606, 607, of the zones of interest, 610, 611, around the
indicated point, 605, may be determined or defined or calculated within the template,
typically by authored heuristics indicating the required or desired extracted content length
for any time-based position within the output production.
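
The inference of zones around an indicated point may be sketched as follows. The heuristic in `durations_for_position`, including its 30-second breakpoint and the duration values, is invented for illustration; the description states only that such durations are typically authored into the template.

```python
def durations_for_position(output_position: float) -> tuple:
    """Hypothetical authored heuristic: (before, after) durations in seconds.

    Longer zones are allowed early in the production, shorter ones later.
    """
    return (2.0, 4.0) if output_position < 30.0 else (1.0, 2.0)


def zone_around_point(point: float, before: float, after: float,
                      content_length: float) -> tuple:
    """Return (start, end) of the extracted clip, clamped to the content."""
    start = max(0.0, point - before)
    end = min(content_length, point + after)
    return start, end


# The user marks a point of significance 12.5 s into a 60 s input clip,
# and the clip is destined for a position 10 s into the output production.
before, after = durations_for_position(output_position=10.0)
clip = zone_around_point(12.5, before, after, content_length=60.0)

# A point near the start of the content is clamped at zero.
early_clip = zone_around_point(1.0, before, after, content_length=60.0)
```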

Fig. 6B indicates, in 620, a possible method for a user to indicate approval or
disapproval of portion or portions of the output production, 621. The user may indicate a
point of approval or disapproval, 625, and this point information may be inferred to
indicate an entire segment of the output production, 630, said segment typically being
extrapolated from said point by means of finding the nearest forward and backward
content boundaries (transitions) or effects, or by applying a heuristically determined
time-step forward and backward from 625 that typically relates to the temporal structure of the
output production.
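
The extrapolation from a point to a segment may be sketched as below, using the Python standard-library `bisect` module to find the nearest boundaries. The description presents the boundary search and the time-step as alternatives; this sketch combines them by using the time-step only where no boundary exists on one side, which is an assumption of the example. The function name and argument layout are hypothetical.

```python
import bisect


def segment_for_point(point: float, boundaries: list,
                      fallback_step: float, total: float) -> tuple:
    """Extrapolate an approval or disapproval point to a whole segment.

    boundaries: times of transitions in the output production.
    fallback_step: time-step used when no boundary exists on one side.
    total: total duration of the output production.
    """
    ordered = sorted(boundaries)
    i = bisect.bisect_right(ordered, point)
    # Nearest boundary at or before the point, else step backward.
    start = ordered[i - 1] if i > 0 else max(0.0, point - fallback_step)
    # Nearest boundary after the point, else step forward.
    end = ordered[i] if i < len(ordered) else min(total, point + fallback_step)
    return start, end


transitions = [0.0, 8.0, 15.0, 27.0]
segment = segment_for_point(10.0, transitions, fallback_step=3.0, total=40.0)
no_boundaries = segment_for_point(10.0, [], fallback_step=3.0, total=40.0)
```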

User interaction may also permit direct or indirect alteration or selection of 
parameters or algorithms or rules to be utilised by the template(s) by means including: 
selection of numerical values for quantities such as clip duration, number of clips, etc;
indirect selection of clip duration or temporal beat or number of clips through selection of
a particular template with the desired characteristics or through indicating preference for
the inclusion of a particular clip of a known duration, therefore potentially overriding
template rules relating to selection of such content; selection from a set of style options
offered by a template as being suitable (said suitability typically being determined
heuristically or aesthetically and authored into said template); selection of a method or
methods, such as a clip selection method preferring to select content from a localised
region of the input content. A template may provide means and especially rules for
requesting all such information or options or preferences or for indirectly requesting said
information or for allowing user-selection of said information. A template may not
require said information but may assume defaults or heuristics, etc unless said information
is offered, or directed by the user. The act by a user of selecting a template may define or
determine some or all of said options or parameters or selections by means of defaults or
default methods being within said template.
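
The interplay between template defaults and optional user input may be sketched as follows. The parameter names and default values are hypothetical; the sketch assumes that a template exposes a fixed set of parameters and that user values are accepted only for those parameters.

```python
def effective_parameters(template_defaults: dict, user_choices: dict) -> dict:
    """Combine template defaults with whatever the user chose to specify."""
    params = dict(template_defaults)
    for name, value in user_choices.items():
        # Ignore parameters the template does not expose, and treat
        # None as "no preference", leaving the default in place.
        if name in template_defaults and value is not None:
            params[name] = value
    return params


# Hypothetical defaults authored into a template.
defaults = {"clip_duration": 4.0, "number_of_clips": 12, "style": "classic"}

# The user overrides one value and declines to choose a style.
params = effective_parameters(defaults, {"clip_duration": 2.5, "style": None})

# Selecting the template alone determines every parameter.
untouched = effective_parameters(defaults, {})
```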

A template may offer a user a hierarchical means or other equivalent or similar 
means for selecting, modifying or controlling or specifying parameters, methods, etc. Said 
hierarchical means of permitting or requesting user input may be implemented by initially 
requesting or offering generalisations for the production, for instance, the template
selection may be the first generalisation, eg. selection from a wedding production
template or a birthday production template. A next level of hierarchical selection may be 
the choice of a church wedding or a garden wedding production within the template. Said 
choice may effect a change to styles and colours or music or related rules and methods 
within the template. A next level of hierarchical selection may be the choice of music or
styles within a segment of the production relating to a particular input content segment.
Thus, if the user is willing or interested or skilled enough to wish to specify detailed 
control of a production and the directing template then a hierarchical method may be 
appropriate for permitting such control where it is requested or required without 
demanding or enforcing the same level of detailed control for the entirety of a production
where it may be unnecessary or undesirable. 
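
A hierarchical selection of this kind may be sketched as a tree in which each level of choice refines the settings inherited from the level above. The tree layout, the function `resolve_choices` and all setting values are hypothetical; only the wedding, church and garden choices come from the description.

```python
def resolve_choices(node: dict, path: list) -> dict:
    """Merge settings from the root down the chosen path; deeper levels win."""
    settings = dict(node.get("settings", {}))
    for choice in path:
        node = node["options"][choice]
        settings.update(node.get("settings", {}))
    return settings


# Hypothetical wedding template with two further levels of choice.
wedding_template = {
    "settings": {"music": "strings", "colours": "pastel"},
    "options": {
        "church": {"settings": {"music": "organ"}},
        "garden": {
            "settings": {"colours": "green"},
            "options": {"reception": {"settings": {"music": "jazz"}}},
        },
    },
}

general = resolve_choices(wedding_template, [])
church = resolve_choices(wedding_template, ["church"])
detailed = resolve_choices(wedding_template, ["garden", "reception"])
```

A user who stops at the first level receives the general settings, while a user who descends further overrides only the settings named at the deeper levels.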

Further examples of user input to or control of or interaction with template rules
include: choice of long, medium or short temporal structure mappings; choice of clip
durations; choice of backing music; inputting text information into generated titles or
dialogue mattes or credits or similar; selection of clipart or effects or styles from a range
of options offered or referenced by a template or system. Some specific user control
examples relating to the template examples already described include: optional chroma
bleed-in at a selected point within the Silent Movie production to obtain the benefit of
colour after the mood has first been set by the effects; textual input to dialogue mattes
within the Silent Movie template example and also into the titles and end titles in the
action template example. A further example of interaction with a user includes
storyboard interaction in which the embodiment, and desirably a template(s), may include
and display features of a storyboard, including images, representative images, icons,
animation, video, a script, audio, etc to convey properties and organisation of the template
and/or production to a user to permit or enable easy comprehension, to assist or guide the
user and to permit or enable easy modification, adjustment, editing, or other control of the
embodiment and/or the output production by the user.

User preferences, historical selections or choices, name, and other information 
may also be recalled from previous use and storage, etc to be utilised by template rules or
as elements within a production (eg. as textual elements in titles, etc). Also, previously 
created productions, previously utilised or modified templates, previously utilised or 
selected input content, or groups of these may be recalled for subsequent use as input 
content or preferences or templates, etc.

Fourth Preferred Embodiment of the Method

A fourth embodiment of the invention is also capable of using information about 
the input content in order to: select or adjust appropriate items or parameters to suit or 
match or fit the mood of the subject or other property of input content, be said input 
content any of audio, video, still-frames, animation, etc; advise, prompt or hint to the user
a selection, choice, single option, alternative, parameter range, style, effect, structure, 
template, etc, for the user to operate on. 

This said capability of the embodiment to use information about the input 
content may be used, engaged or activated in conjunction with any of the previously 
described embodiments or examples of the invention in any reasonable combination of 
functions or features. 

The information about the input content may be obtained from an external 
source, such as a DVCamcorder, for instance, Canon model Optura, which is capable of 
performing some content analysis during recording of said content. Said external source 
may provide said information, also described as metadata, from content analysis or a 
recording of results from an earlier content analysis operation, or metadata may be 
supplied from other information also available at the time of recording the content. Such 
other information may include lens and aperture settings, focus information, zoom 
information, and also white balance information may be available. Information made 
available by said external source may include motion information, for instance, the said 
DVCamcorder is capable of providing motion information as part of its image encoding 
method and this information may be extracted and used for other purposes such as those 
described for this embodiment. Further examples of other information that may also be 
available from the external input content source include: time, date or event information, 
for instance, information describing or referencing or able to be linked to a particular 
event on a particular day such as a sporting event; locality or geographical information, 
including GPS (Global Positioning System) information. 

The embodiment may also be capable of analysing input content and providing 
its own information source or metadata source. Such analyses may be performed to 
obtain metadata including these types: audio amplitude; audio event (loud noises, etc); 
audio characterisation (eg. identifying laughter, voices, music, etc); image motion 
properties; image colour properties; image object detection (eg. face or body detection); 
inferred camera motion; scene changes; date, time of recording, etc; light levels; voice 
recognition; voice transcription; etc. 
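
As an illustration of one such analysis, the following sketch derives audio events (loud noises) from a sequence of audio amplitude measurements. The frame length, the threshold and the sample values are assumptions of the example; the description does not specify how amplitude is measured or what counts as loud.

```python
def loud_events(amplitudes: list, frame_seconds: float, threshold: float) -> list:
    """Return (start, end) times of runs of frames at or above `threshold`."""
    events = []
    start = None  # index of the first frame of the current loud run
    for i, amplitude in enumerate(amplitudes):
        if amplitude >= threshold and start is None:
            start = i
        elif amplitude < threshold and start is not None:
            events.append((start * frame_seconds, i * frame_seconds))
            start = None
    if start is not None:
        # The content ended while still loud; close the final run.
        events.append((start * frame_seconds, len(amplitudes) * frame_seconds))
    return events


# Hypothetical normalised amplitude per half-second frame.
levels = [0.1, 0.2, 0.9, 0.8, 0.1, 0.1, 0.7]
events = loud_events(levels, frame_seconds=0.5, threshold=0.6)
```

The resulting time ranges are a simple form of metadata that template rules could consult when searching for events of interest.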
The more that the embodiment is capable of inferring about the subject or action
or other details of a scene or event recorded within input content then the more capable
the embodiment may be to perform template functions, including: searching, selection and
extraction of clips from input content, for instance, to find events of interest or
appropriate to the mood of the applied template; to infer relationships between portions of
input content and to maximise benefits from these through control of clip-adjacency,
chronology, subject selection frequency, etc; to infer the subject of input content in order
to select an appropriate template, or function or effect within a template; to select
appropriate transition properties, eg. colour, speed, type, based on information about the
input content such as colour, motion and light level.
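
The selection of transition properties from information about the input content may be sketched as a small rule set. The metadata field names, the 0..1 scales, the thresholds and the particular mapping from motion and light level to transition type, speed and colour are all assumptions made for illustration; a template author would supply the actual rules.

```python
def choose_transition(metadata: dict) -> dict:
    """Pick transition type, speed and colour from content metadata."""
    motion = metadata.get("motion", 0.0)        # assumed scale 0..1
    light = metadata.get("light_level", 0.5)    # assumed scale 0..1
    fast = motion > 0.6
    return {
        # Fast-moving content gets an abrupt, quick transition.
        "type": "cut" if fast else "dissolve",
        "speed": "fast" if fast else "slow",
        # Dark scenes fade through black, brighter scenes through white.
        "colour": "black" if light < 0.3 else "white",
    }


sports_clip = choose_transition({"motion": 0.8, "light_level": 0.7})
evening_clip = choose_transition({"motion": 0.1, "light_level": 0.2})
```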

The embodiment may also include the capability to access and search input 
content by means of labels applied to or associated with or referencing the input content. 
Said labels may have been applied by the embodiment itself or by any other source. Said 
labels may be applied in patterns to label portions of input content according to any rule
method required or appreciated by the user or by any source acting on the user's behalf or
under the user's instructions. Thus, an input content section may contain labels 
describing specific subjects within the content, such as the user's family, and the 
embodiment may utilise these labels to select or avoid selecting said labelled portions 
based on instructions provided to the embodiment. Said instructions need not be provided
directly by a user. For example, the user may select a template which has been previously
defined, and is declared to the user through an appropriate mechanism, eg. the template 
name, to select for family snapshots or video clips. Said family snapshot template may 
include reference to a labelling scheme which permits it to interpret the labels previously 
placed on or associated with the input content, therefore the embodiment need not require
direct user action or control concerning the labelled input content. The embodiment may
support the user's optional wish to override or modify the labels, thereby allowing control 
of the current production process and possibly of future production processes if permanent 
modifications are retained for the labels. 
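
Label-based selection under the direction of a template may be sketched as follows. The segment records, the label vocabulary and the `select` and `avoid` fields of the family snapshot template are hypothetical; the sketch shows only that a template can carry the selection instructions so that the user need not act on the labels directly.

```python
from typing import AbstractSet, List


def select_segments(segments: List[dict], wanted: AbstractSet[str],
                    avoided: AbstractSet[str] = frozenset()) -> List[str]:
    """Return ids of segments carrying a wanted label and no avoided label."""
    chosen = []
    for segment in segments:
        labels = set(segment["labels"])
        if labels & wanted and not labels & avoided:
            chosen.append(segment["id"])
    return chosen


# Hypothetical labelled input content.
segments = [
    {"id": "s1", "labels": ["family", "beach"]},
    {"id": "s2", "labels": ["scenery"]},
    {"id": "s3", "labels": ["family", "blurred"]},
]

# The template, not the user, states which labels to select or avoid.
family_snapshot_template = {"select": {"family"}, "avoid": {"blurred"}}

picked = select_segments(
    segments,
    family_snapshot_template["select"],
    family_snapshot_template["avoid"],
)
```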
Preferred Embodiments of Apparatus

The method of multimedia editing is preferably implemented in dedicated 
hardware such as one or more integrated circuits performing the functions or
sub-functions of the editing. Such dedicated hardware may include graphic processors, digital
signal processors, or one or more microprocessors and associated memories. An 
apparatus incorporating the aforementioned dedicated hardware can be a device (for 
example a DVCamcorder or other device) having random access storage, and replay 
capability, plus at least some editing, and possibly effects capability. The DVCamcorder 
has, in addition, a communications capability for exchange/transfer of data and 
control/status information with another machine. The other machine can be a PC running 
a control program such as a template having the user interface. The PC and 
DVCamcorder can exchange video/audio data, as well as control and template 
instructions. The machines can operate in a synchronous, or asynchronous manner as 
required. 

The multi-media editing processes are alternatively practiced using a 
conventional general-purpose computer, such as the one shown in Fig. 7, wherein the 
processes of Figures 1 to 6 are implemented as software executing on the computer. In 
particular, the steps of the editing methods are effected by instructions in the software that 
are carried out by the computer. The software may be divided into two separate parts; one 
part for carrying out the editing methods; and another part to manage the user interface 
between the latter and the user. The software may be stored in a computer readable 
medium, including the storage devices described below, for example. The software is 
loaded into the computer from the computer readable medium, and then executed by the 
computer. A computer readable medium having such software or computer program 
recorded on it is a computer program product. The use of the computer program product 
in the computer preferably effects an advantageous apparatus for multi-media editing in 
accordance with the embodiments of the method of the invention. 

The computer system 700 consists of the computer 702, a video display 704, and 
input devices 706, 708. In addition, the computer system 700 can have any of a number 
of other output devices including line printers, laser printers, plotters, and other
reproduction devices connected to the computer 702. The computer system 700 can be
connected to one or more other computers via a communication interface 710 using an
appropriate communication mechanism such as a modem communications path, a
computer network, or the like. The computer system 700 can also be optionally
connected to specialised devices such as rendering hardware or video accelerators 732 by
means of communication interface 710. The computer network may include a local area
network (LAN), a wide area network (WAN), an Intranet, and/or the Internet.

The computer 702 itself consists of one or more central processing unit(s)
(simply referred to as a processor hereinafter) 714, a memory 716 which may include
random access memory (RAM) and read-only memory (ROM), input/output (IO)
interfaces 710, 718, a video interface 720, and one or more storage devices generally
represented by a block 722 in Fig. 7. The storage device(s) 722 can consist of one or
more of the following: a floppy disc, a hard disc drive, a magneto-optical disc drive,
CD-ROM, magnetic tape or any other of a number of non-volatile storage devices well known
to those skilled in the art. Each of the components 720, 710, 722, 714, 718, 716 and 728
is typically connected to one or more of the other devices via a bus 1024 that in turn can
consist of data, address, and control buses.

The video interface 720 is connected to the video display 704 and provides video
signals from the computer 702 for display on the video display 704. User input to operate
the computer 702 is provided by one or more input devices. For example, an operator can 
use the keyboard 706 and/or a pointing device such as the mouse 708 to provide input to 
the computer 702. 

The system 700 is simply provided for illustrative purposes and other
configurations can be employed without departing from the scope and spirit of the
invention. Exemplary computers on which the embodiment can be practiced include
IBM-PC/ATs or compatibles, one of the Macintosh (TM) family of PCs, Sun Sparcstation
(TM), or the like. The foregoing is merely exemplary of the types of computers with
which the embodiments of the invention may be practiced. Typically, the processes of the
embodiments, described hereinbefore, are resident as software or a program recorded on a
hard disk drive (generally depicted as block 726 in Fig. 7) as the computer readable
medium, and read and controlled using the processor 714. Intermediate storage of the
input and template data and any data fetched from the network may be accomplished
using the semiconductor memory 716, possibly in concert with the hard disk drive 726.

In some instances, the program may be supplied to the user encoded on a
CD-ROM 728 or a floppy disk 730, or alternatively could be read by the user from the
network via a modem device 712 connected to the computer, for example. Still further,
the software can also be loaded into the computer system 700 from other computer
readable medium including magnetic tape, a ROM or integrated circuit, a magneto-optical
disk, a radio or infra-red transmission channel between the computer and another device,
a computer readable card such as a PCMCIA card, and the Internet and Intranets including
email transmissions and information recorded on websites and the like. The foregoing is
merely exemplary of relevant computer readable mediums. Other computer readable
mediums may be practiced without departing from the scope and spirit of the invention.

The foregoing only describes a small number of embodiments of the present 
invention, however, modifications and/or changes can be made thereto by a person skilled 
in the art without departing from the scope and spirit of the invention. The present 
embodiments are, therefore, to be considered in all respects to be illustrative and not
restrictive.